
    Scaling Analysis of Affinity Propagation

    We analyze and exploit some scaling properties of the Affinity Propagation (AP) clustering algorithm proposed by Frey and Dueck (2007). First we observe that a divide-and-conquer strategy, applied hierarchically to a large data set, reduces the complexity from $\mathcal{O}(N^2)$ to $\mathcal{O}(N^{(h+2)/(h+1)})$, for a data set of size $N$ and a depth $h$ of the hierarchical strategy. For a data set embedded in a $d$-dimensional space, we show that this is obtained without notably damaging the precision except in dimension $d=2$. In fact, for $d$ larger than 2 the relative loss in precision scales like $N^{(2-d)/(h+1)d}$. Finally, under some conditions we observe that there is a value $s^*$ of the penalty coefficient, a free parameter used to fix the number of clusters, which separates a fragmentation phase (for $s<s^*$) from a coalescent one (for $s>s^*$) of the underlying hidden cluster structure. At this precise point a self-similarity property holds, which the hierarchical strategy can exploit to locate $s^*$. From this observation, a strategy based on AP can be defined to find out how many clusters are present in a given data set. Comment: 28 pages, 14 figures, Inria research report
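
    As a rough illustration of the complexity claim in the abstract above, the sketch below evaluates the exponent $(h+2)/(h+1)$ for a few depths. The function names `complexity_exponent` and `estimated_cost` are hypothetical and do not come from the paper; constants hidden by the $\mathcal{O}$ notation are ignored, so the numbers show only how the exponent shrinks with depth, not actual running times.

```python
from fractions import Fraction


def complexity_exponent(h: int) -> Fraction:
    """Exponent e in O(N**e) quoted in the abstract: e = (h + 2) / (h + 1).

    h is the depth of the hierarchical strategy. At h = 0 the formula
    gives 2, which matches the quoted O(N^2) cost.
    """
    if h < 0:
        raise ValueError("depth h must be non-negative")
    return Fraction(h + 2, h + 1)


def estimated_cost(n: int, h: int) -> float:
    """N raised to the exponent above, ignoring all constant factors
    (an assumption made only for illustration)."""
    return float(n) ** float(complexity_exponent(h))


if __name__ == "__main__":
    n = 1_000_000  # hypothetical data-set size
    for depth in range(4):
        e = complexity_exponent(depth)
        print(f"h={depth}: exponent={e}, N^e ~ {estimated_cost(n, depth):.3g}")
```

    The exponent decreases toward 1 as $h$ grows, so each extra level of hierarchy brings a smaller gain than the previous one.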

    Multiclass Semi-Supervised Learning on Graphs using Ginzburg-Landau Functional Minimization

    We present a graph-based variational algorithm for classification of high-dimensional data, generalizing the binary diffuse interface model to the case of multiple classes. Motivated by total variation techniques, the method involves minimizing an energy functional made up of three terms. The first two terms promote a stepwise continuous classification function with sharp transitions between classes, while preserving symmetry among the class labels. The third term is a data fidelity term, allowing us to incorporate prior information into the model in a semi-supervised framework. The performance of the algorithm on synthetic data, as well as on the COIL and MNIST benchmark datasets, is competitive with state-of-the-art graph-based multiclass segmentation methods. Comment: 16 pages, to appear in Springer's Lecture Notes in Computer Science volume "Pattern Recognition Applications and Methods 2013", part of series on Advances in Intelligent and Soft Computing

    Uncertainty quantification in graph-based classification of high dimensional data

    Classification of high-dimensional data finds wide-ranging applications. In many of these applications, equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. In this paper we introduce, develop algorithms for, and investigate the properties of a variety of Bayesian models for the task of binary classification; via the posterior distribution on the classification labels, these methods automatically give measures of uncertainty. The methods are all based around the graph formulation of semi-supervised learning. We provide a unified framework which brings together a variety of methods which have been introduced in different communities within the mathematical sciences. We study probit classification in the graph-based setting, generalize the level-set method for Bayesian inverse problems to the classification setting, and generalize the Ginzburg-Landau optimization-based classifier to a Bayesian setting; we also show that the probit and level-set approaches are natural relaxations of the harmonic function approach introduced in [Zhu et al. 2003]. We introduce efficient numerical methods, suited to large data sets, for both MCMC-based sampling and gradient-based MAP estimation. Through numerical experiments we study classification accuracy and uncertainty quantification for our models; these experiments showcase a suite of datasets commonly used to evaluate graph-based semi-supervised learning algorithms. Comment: 33 pages, 14 figures

    Action Recognition with a Bio--Inspired Feedforward Motion Processing Model: The Richness of Center-Surround Interactions

    Here we show that reproducing the functional properties of MT cells with various center-surround interactions enriches motion representation and improves action recognition performance. To do so, we propose a simplified bio-inspired model of the motion pathway in primates: it is a feedforward model restricted to V1-MT cortical layers; cortical cells cover the visual space with a foveated structure; and, more importantly, we reproduce some of the richness of center-surround interactions of MT cells. Interestingly, as observed in neurophysiology, our MT cells not only behave like simple velocity detectors, but also respond to several kinds of motion contrasts. Results show that this diversity of motion representation at the MT level is a major advantage for an action recognition task. Defining motion maps as our feature vectors, we used a standard classification method on the Weizmann database: we obtained an average recognition rate of 98.9%, which is superior to the recent results by Jhuang et al. (2007). These promising results encourage us to further develop bio-inspired models incorporating other brain mechanisms and cortical layers in order to deal with more complex videos.

    Music genre profiling based on Fisher manifolds and Probabilistic Quantum Clustering

    Probabilistic classifiers induce a similarity metric at each location in the space of the data. This is measured by the Fisher Information Matrix. Pairwise distances in this Riemannian space, calculated along geodesic paths, can be used to generate a similarity map of the data. The novelty in the paper is twofold: to improve the methodology for visualisation of data structures in low-dimensional manifolds, and to illustrate the value of inferring the structure from a probabilistic classifier by metric learning, through application to music data. This leads to the discovery of new structures and song similarities beyond the original genre classification labels. These similarities are not directly observable by measuring Euclidean distances between features of the original space, but require the correct metric to reflect similarity based on genre. The results quantify the extent to which music from bands typically associated with one particular genre can, in fact, cross over strongly to another genre.

    Minimal-cut model composition

    Constructing new, complex models is often done by reusing parts of existing models, typically by applying a sequence of segmentation, alignment and composition operations. Segmentation, either manual or automatic, is rarely adequate for this task, since it is applied to each model independently, leaving it to the user to trim the models and determine where to connect them. In this paper we propose a new composition tool. Our tool takes as input two models, aligned either manually or automatically, and a small set of constraints indicating which portions of the two models should be preserved in the final output. It then automatically negotiates the best location to connect the models, trimming and stitching them as required to produce a seamless result. We offer a method based on the graph-theoretic minimal cut as a means of implementing this new tool. We describe a system intended for both expert and novice users, allowing easy and flexible control over the composition result. In addition, we show our method to be well suited for a variety of model processing applications such as model repair, hole filling, and piecewise rigid deformations.